Corpus Annotation And Reference Resolution

Authors

  • Tony McEnery
  • Izumi Tanaka
  • Simon Botley
Abstract

A variety of approaches to annotating reference in corpora have been adopted. This paper reviews four of these approaches. Following this, we present a variety of results from one annotated corpus, the UCREL anaphoric treebank, relevant to automated reference resolution.

Similar resources

Creating a Novel Geolocation Corpus from Historical Texts

This paper describes the process of annotating a historical US civil war corpus with geographic reference. Reference annotations are given at two different textual scales: individual place names and documents. This is the first published corpus of its kind in document-level geolocation, and it has over 10,000 disambiguated toponyms, double the amount of any prior toponym corpus. We outline many...

Text Segmentation using Named Entity Recognition and Co-reference Resolution

In this paper we examine the benefit of performing named entity recognition (NER) and co-reference resolution on an English and a Greek corpus used for text segmentation. The aim here is to examine whether the combination of text segmentation and information extraction can be beneficial for the identification of the various topics that appear in a document. NER was performed manually in the Eng...

Corpus based coreference resolution for Farsi text

"Coreference resolution", or "finding all expressions that refer to the same entity" in a text, is one of the important requirements in natural language processing. Two words are coreferent when both refer to a single entity in the text or the real world. So the main task of coreference resolution systems is to identify terms that refer to a unique entity. A coreference resolution tool could be...

Co-reference annotation and resources: A multilingual corpus of typologically diverse languages

This article introduces a dialogue corpus containing data from two typologically different languages, Japanese and Kilivila. The corpus is annotated in accordance with language-specific annotation schemes for co-referential and similar relations. The article describes the corpus data, the properties of language-specific co-reference in the two languages and a methodology for its annotation. Exa...

The Use of Referential Constraints in Structuring Discourse

The quality of discourse structure annotations is negatively influenced by the numerous difficulties that occur in the analysis process. In contrast, referential annotation resources are considerably more reliable, given the high precision of existing anaphora resolution systems. We present an approach based on the Veins Theory (Cristea, Ide, Romary, 1998), in which successful reference ann...

Publication date: 1997